I’ve been using C++ for many years, but it is such an incredibly rich and complex language that I still learn new things about it from time to time. In this article, I would like to point out five of the more surprising features that I’ve discovered, both for the “Wow, I had no idea!” factor and because some of them have proven useful to me.
Did you know that a case label doesn’t have to appear immediately inside a switch statement? It can appear at a deeper level, such as inside a loop that appears inside the switch statement. That means that it is possible to jump into the middle of a for loop! This kind of unstructured flow control is normally considered poor style, but there are some cases where it has advantages. The original and most famous use is Duff’s Device, a method of handling the remainder when doing loop unrolling. It has also been used to implement state-machine-like functions that restart from where they left off (by putting a case label immediately after each return statement, and saving the index just before returning). I used this in the beta Marathon Match (CostlySorting), in which your function provides queries to the system in its return value, with replies being passed to the next iteration of the function. There are, however, some serious limitations to the technique: non-static local variables get clobbered by the restart, extra work is required to make the function re-entrant, and you cannot return from an inner case statement.
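To make the restart trick concrete, here is a minimal sketch of a function that resumes inside its own for loop. The names next_value and next_value_example are made up for illustration, and the static variables are exactly the limitation mentioned above: they survive the return, but they also make the function non-re-entrant.

```cpp
#include <cassert>

// Hypothetical resumable function: successive calls return 1, 2, 3 and then
// -1 forever. Both the resume point and the loop counter are static, because
// ordinary locals would not survive the return.
int next_value() {
    static int state = 0;  // which case label to resume from
    static int i = 0;      // loop counter, kept across calls

    switch (state) {
    case 0:
        for (i = 1; i <= 3; ++i) {
            state = 1;     // save the resume point just before returning
            return i;
    case 1:;               // the next call jumps here, inside the loop body
        }
        state = 2;         // finished: no case label matches 2
    }
    return -1;
}

// Usage sketch: each call picks up where the previous one left off.
void next_value_example() {
    assert(next_value() == 1);
    assert(next_value() == 2);
    assert(next_value() == 3);
    assert(next_value() == -1);
}
```

On the second call the switch jumps straight to case 1, which sits in the middle of the loop body, so the loop simply carries on with its increment and test.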
Technically this one is more a feature of the C preprocessor than the C++ language, but since the two are so closely linked I thought it worth including. Have you ever wondered what happens if you use the name of a macro in the definition of that macro? For example:
Clearly, any attempt to expand such a macro recursively will cause an infinite loop, since there is no way to terminate the expansion at the chosen point. Thus, you may be forgiven for thinking that this leads to a preprocessor error. In fact, if a macro occurs in its own expansion (either directly or via another macro expansion), it is not expanded again but rather left as is.
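A small sketch of this rule in action, using a made-up variable called budget that shares its name with a macro (the names are mine, chosen only for illustration):

```cpp
#include <cassert>

int budget = 10;

// From here on, the macro shadows the variable's name. The "budget" inside
// the replacement text is not expanded again, so it refers to the variable.
#define budget (4 + budget)

int read_budget() {
    return budget;  // preprocesses to: return (4 + budget);
}

void self_reference_example() {
    assert(read_budget() == 14);
}
```

The preprocessor expands the outer budget once and then leaves the inner one alone, so the compiler sees an ordinary expression involving the variable.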
This has practical applications in writing transparent wrappers. Suppose you’ve implemented a solution to a TopCoder problem, and it is going wrong somewhere. You suspect that you’re passing a negative value to one of many calls to sqrt. A simple way to test for this without modifying any of your sqrt calls is to put the following near the top of your file, after the #include directives (so that the declarations in the headers are not rewritten by the macro):
#define sqrt(x) (assert((x) >= 0.0), sqrt(x))
Compared to using a wrapper function, this has the advantage that the assert will report the line number of the actual call site, rather than the line number in the wrapper function.
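Here is a small sketch of how the wrapper might sit in a complete file. The hypotenuse function is a made-up caller, and I am assuming the calls are unqualified: a qualified call such as std::sqrt(x) would be mangled by the macro and fail to compile.

```cpp
#include <assert.h>
#include <math.h>   // declares sqrt in the global namespace

// Defined after the headers, so their own declarations of sqrt are untouched.
// The inner sqrt is not re-expanded and therefore calls the real function.
#define sqrt(x) (assert((x) >= 0.0), sqrt(x))

// Hypothetical caller: the call site is unchanged, but is now checked.
double hypotenuse(double a, double b) {
    return sqrt(a * a + b * b);
}

void sqrt_wrapper_example() {
    assert(hypotenuse(3.0, 4.0) == 5.0);
}
```

If the argument were ever negative, the assertion failure would name the line inside hypotenuse, which is the point of doing this with a macro.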
Templates should be old hat to seasoned C++ programmers, especially if they are proficient with the STL. But how often do you see code that looks like this?
template<typename T, template<typename> class I> class Vector { ... }
What the heck does a template class as a template parameter mean? I’ll explain it in terms of a problem I had that I solved in this way. I wanted to write a vector class (in the mathematical sense of the word “vector”), with template parameters to say what internal storage type to use (e.g., float or double), and also an inner product operator (a dot product is a particular instance of an inner product). So I’d write a function object that implemented an inner product such as the dot product, and of course templatize it:
template<typename T> class DotProduct { ... }
Here T is the underlying storage type, not the class of the Vector, since that would lead to cyclic dependencies later. Now, I want to make the type of the inner product function object a template parameter to the Vector class. However, if I just make it an ordinary template parameter, then nothing enforces the constraint that the storage class for the Vector itself and for the inner product object must be the same (other than some very obscure compilation errors). The templatized template parameter indicates that instead of taking a fully specialized class as the parameter, it takes a templatized class with the specified parameters; in this case, a single type parameter. I can then instantiate a Vector as Vector<double, DotProduct> v. Within the implementation of Vector, I use I<T> to refer to the particular inner product class that I want to use. In the example instantiation, that will resolve to DotProduct<double>.
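A stripped-down sketch of the idea follows. The fixed three components, the inner member function and the operator() signature of DotProduct are assumptions of mine to keep the example short; they are not the original class.

```cpp
#include <cassert>
#include <cstddef>

// Function object computing the ordinary dot product over storage type T.
template<typename T>
class DotProduct {
public:
    T operator()(const T* a, const T* b, std::size_t n) const {
        T sum = T();
        for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
        return sum;
    }
};

// I is itself a template taking one type parameter. Vector instantiates it
// with its own storage type, so the two can never disagree.
template<typename T, template<typename> class I>
class Vector {
public:
    Vector(T x, T y, T z) { v_[0] = x; v_[1] = y; v_[2] = z; }
    T inner(const Vector& other) const { return I<T>()(v_, other.v_, 3); }
private:
    T v_[3];
};

void vector_example() {
    // DotProduct is passed as a template, not as DotProduct<double>.
    Vector<double, DotProduct> a(1.0, 2.0, 3.0), b(4.0, 5.0, 6.0);
    assert(a.inner(b) == 32.0);
}
```

Writing Vector<double, DotProduct<float> > is rejected outright, because the second argument must be a template rather than a type.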
As another example, GCC’s implementation of valarray has some extremely complicated use of template template parameters in order to evaluate expressions efficiently without redundant copies.
Pointers are the bread and butter of C and C++. But C++ introduces a different sort of pointer (which is not interchangeable with a normal pointer), called the pointer to member. It points to a member in a class (note: not in a particular instance of the class, but in the class itself). For example, given the definition
class A {
public:
    int x, y;
    int f(void * p);
};
&A::x and &A::y are pointers to data members and &A::f is a pointer to member function. The first two are of type int A::* while the last is of type int (A::*)(void *). Note how this compares to normal pointers: the * is replaced by A::* to indicate that this is a pointer to a member of A.
These pointers are in a sense incomplete, because they don’t reference any particular instance of A. The instance is provided when you dereference the pointers, using the ->* or .* operator. Given the definitions
A a, * aptr;
int A::* aintptr;
int (A::* afuncptr)();
one can write a.*aintptr, aptr->*aintptr, (a.*afuncptr)() or (aptr->*afuncptr)().
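Putting the syntax together, here is a small sketch built on the class A from earlier. The body of f and the function pointer_to_member_demo are invented for illustration, and the function pointer is declared with the (void *) parameter so that it can actually point at A::f.

```cpp
#include <cassert>

class A {
public:
    int x, y;
    // Hypothetical body: returns x for a non-null argument, y otherwise.
    int f(void* p) { return p ? x : y; }
};

int pointer_to_member_demo() {
    A a;
    a.x = 1;
    a.y = 2;
    A* aptr = &a;

    int A::* aintptr = &A::x;            // names a member, not an object
    int (A::* afuncptr)(void*) = &A::f;

    int sum = a.*aintptr;                // reads a.x
    aintptr = &A::y;                     // retarget to the other member
    sum += aptr->*aintptr;               // reads aptr->y
    sum += (a.*afuncptr)(&a);            // calls a.f(&a), giving x
    sum += (aptr->*afuncptr)(nullptr);   // calls aptr->f(nullptr), giving y
    return sum;                          // 1 + 2 + 1 + 2
}

void pointer_to_member_example() {
    assert(pointer_to_member_demo() == 6);
}
```

The parentheses around a.*afuncptr and aptr->*afuncptr are required, because the function call operator binds more tightly than .* and ->*.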
For pointers to member functions, the compiler handles virtual functions correctly and will call the version from the appropriate class. As a result, most compilers (including GCC) will use more bytes for a pointer to member function than for a normal pointer, to keep track of necessary information. It is thus perhaps not surprising that the standard is particularly strict about conversions of pointers to members. For example, you can’t convert a pointer to member to and from a void pointer, as you can with a normal object pointer.
That’s all very well, but what use is it? As an example, I have used them in an algorithm that did some processing on a 3D mesh. Each face of the mesh has edges, which can be walked clockwise or anti-clockwise using Edge::prev() and Edge::next() member functions. The algorithm in question needed to process a face twice, once clockwise and once anti-clockwise. Rather than duplicating the code or littering it with conditionals, I used a pointer to member function that pointed to either Edge::next or Edge::prev, depending on the direction that the algorithm wished to walk.
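The mesh code itself is too large to show, but the direction trick can be sketched with a toy Edge type. Everything here apart from the names Edge::next and Edge::prev is an assumption of mine: the id field, the stored neighbour pointers and the walk function are only there to make the example self-contained.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a mesh edge: each edge knows its neighbours on the face.
struct Edge {
    int id;
    Edge* next_;
    Edge* prev_;
    Edge* next() const { return next_; }
    Edge* prev() const { return prev_; }
};

// Either Edge::next or Edge::prev, chosen by the caller.
typedef Edge* (Edge::*Step)() const;

// Visits every edge of a face once, in the direction given by step.
std::vector<int> walk(Edge* start, Step step) {
    std::vector<int> ids;
    Edge* e = start;
    do {
        ids.push_back(e->id);
        e = (e->*step)();
    } while (e != start);
    return ids;
}

void walk_example() {
    Edge e[3];  // a triangular face
    for (int i = 0; i < 3; ++i) {
        e[i].id = i;
        e[i].next_ = &e[(i + 1) % 3];
        e[i].prev_ = &e[(i + 2) % 3];
    }
    std::vector<int> forward = walk(&e[0], &Edge::next);   // 0, 1, 2
    std::vector<int> backward = walk(&e[0], &Edge::prev);  // 0, 2, 1
    assert(forward[1] == 1 && forward[2] == 2);
    assert(backward[1] == 2 && backward[2] == 1);
}
```

The loop body is written once, and the only thing that differs between the two passes is the pointer handed to walk.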
Be aware that this isn’t necessarily the most efficient way to do it, since calling any function pointer is generally unfriendly to branch prediction, and a pointer to member function has extra complications due to handling virtual functions. It is a very maintainable approach though, particularly if you use some abstraction to hide the uglier pieces of syntax. The STL has a wrapper called mem_fun, which accepts a pointer to member function and wraps it in a function object that can be called with a pointer to the object on which to apply the member function. Unfortunately it only works for member functions that take at most one argument, and it was deprecated in C++11 and removed in C++17 in favor of std::mem_fn, which handles any number of arguments. You can also apply the same idea in your own code.
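As a sketch of the wrapper idea, the following uses std::mem_fn from <functional>, the C++11 successor to mem_fun, rather than mem_fun itself. The Rect type and the function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical class with one nullary and one unary member function.
struct Rect {
    double w, h;
    double area() const { return w * h; }
    double scaled_area(double k) const { return k * w * h; }
};

// Applies Rect::area to every element via a function object, then sums.
double total_area(const std::vector<Rect>& rects) {
    std::vector<double> areas(rects.size());
    std::transform(rects.begin(), rects.end(), areas.begin(),
                   std::mem_fn(&Rect::area));
    return std::accumulate(areas.begin(), areas.end(), 0.0);
}

void mem_fn_example() {
    Rect a = {2.0, 3.0};
    Rect b = {1.0, 4.0};
    std::vector<Rect> rects;
    rects.push_back(a);
    rects.push_back(b);
    assert(total_area(rects) == 10.0);
    // The wrapped function takes the object first, then the arguments.
    assert(std::mem_fn(&Rect::scaled_area)(a, 2.0) == 12.0);
}
```

The ->* and .* syntax never appears at the call site, which is most of the appeal.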
C++ allows references to an object to be marked as constant. This promotes good programming practice: a function parameter declared as a constant reference is self-documenting (anyone reading the function declaration will know that the function won’t clobber their object), and any accidental modifications will be rejected by the compiler.
However, in some cases, what appears at the language level to be a modification is actually not a modification at a higher semantic level. The best example of this is any caching technique (such as memoization). If a value is entered into a cache, the object will not behave any differently (apart from it being faster to compute the same value again), and thus it hasn’t really changed. Here, the mutable keyword can help. It can be applied to a data member in a class, and indicates that the member may be modified even when the object is being used as a constant. In our caching example, it would be applied to the cache to allow a query function to update it.
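A minimal sketch of the caching case, using a made-up memoizing Fibonacci class (the class name, the cached helper and the choice of std::map are my own, purely for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical memoizing function object. Calling it does not change its
// observable behaviour, so operator() is const; only the cache is mutable.
class Fibonacci {
public:
    Fibonacci() {}

    unsigned long long operator()(unsigned n) const {
        if (n < 2) return n;
        std::map<unsigned, unsigned long long>::const_iterator it = cache_.find(n);
        if (it != cache_.end()) return it->second;
        unsigned long long result = (*this)(n - 1) + (*this)(n - 2);
        cache_[n] = result;  // allowed in a const member because of mutable
        return result;
    }

    std::size_t cached() const { return cache_.size(); }

private:
    mutable std::map<unsigned, unsigned long long> cache_;
};

void mutable_example() {
    const Fibonacci fib;       // a constant object
    assert(fib(10) == 55);
    assert(fib.cached() == 9); // entries for n = 2 through 10
}
```

Without the mutable keyword, the assignment into cache_ inside the const operator() would be a compile error.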
Watch for a follow-up article coming soon, featuring five things you didn’t know about initialization and destruction in C++.